# DAQ requirements and datalink development

Mikihiko Nakao (KEK, IPNS) mikihiko.nakao@kek.jp

March 17, 2009 SuperBelle meeting, KEK

### DAQ Mission

#### Record all data

- 30 kHz design maximum L1 trigger rate (now 0.5 kHz): only 50% margin over the nominal rate at  $\mathcal{L} = 8 \times 10^{35}$ 

#### Minimize deadtime

- Pipelined digitization, feature extraction
- Existing constraints impose a minimum 3% readout deadtime fraction at 30 kHz; nothing more should be added
- Deadtime due to continuous injection (2% at 50 Hz inj)

#### Datasize will be larger

- More channels and noise  $\Rightarrow$  100 kB/event? (now 40 kB)
- +PXD data: 100 Gbit/s(!?) raw rate; how to reduce?

#### Stable and reliable operation

- Unification of the readout system components (COPPER)
- Datalink (RocketIO)
- Digitization and feature extraction?

### Trigger rate extrapolation



- Belle's trigger rate is dominated by the luminosity component  $\Rightarrow 20 \text{ kHz}$  will be the nominal trigger rate for SuperBelle
- Will a maximum of 30 kHz be sufficient? Two major brick walls: APV25 deadtime and COPPER bandwidth

### Pipeline readout

Two time domains, digitization can be done at either step



#### Key parameters

- Readout clock frequency ( $f_{RCLK}$ )
- Depth of the ring buffer  $(N_{buf})$
- Minimum time interval between two triggers ( $t_{in}$ )
- Time for data transfer to the next stage  $(t_{out})$

#### Example

SVD (APV25):  $f_{\text{RCLK}} = 32 \text{ MHz}$ ,  $N_{\text{buf}} = 5$ ,  $t_{\text{in}} = 210 \text{ ns}$ ,  $t_{\text{out}} = 30 \mu \text{s}$ 

#### M. Friedl

Four parameters ( $f_{RCLK}$ ,  $N_{buf}$ ,  $t_{in}$ ,  $t_{out}$ ) completely determine the deadtime characteristics

- SVD with APV25 readout at 30 kHz
  - 31.8 MHz readout clock (RF/16)  $\Rightarrow$  3.4% deadtime
  - 42.3 MHz readout clock (RF/12)  $\Rightarrow$  0.9% deadtime but the L1 trigger latency has to be 3  $\mu$ s (unacceptable!)
  - No other immediate alternative than using APV25
- FPGA-based outer detectors should work better

APV Trigger Simulation (2)



- Min Lost: trigger restriction (1) = too little distance
- FIFO Lost: trigger restriction (2) = too many pending readouts
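
The two loss mechanisms above can be illustrated with a toy Monte Carlo. This is only a sketch under simplifying assumptions: Poisson triggers, a strictly serial readout taking $t_{out}$ per event, and a FIFO of $N_{buf}$ pending readouts. It is not M. Friedl's simulation, and the function names are hypothetical, so the absolute numbers need not reproduce the percentages quoted above.

```python
import random
from collections import deque

def simulate_deadtime(rate_hz, t_in, t_out, n_buf, n_triggers=200_000, seed=1):
    """Toy model: count accepted triggers, 'Min Lost' and 'FIFO Lost'."""
    rng = random.Random(seed)
    t = 0.0
    last_accepted = float("-inf")
    last_done = 0.0
    pending = deque()  # completion times of readouts still in progress
    counts = {"accepted": 0, "min_lost": 0, "fifo_lost": 0}
    for _ in range(n_triggers):
        t += rng.expovariate(rate_hz)  # Poisson arrivals (assumption)
        while pending and pending[0] <= t:
            pending.popleft()  # these readouts have finished
        if t - last_accepted < t_in:
            counts["min_lost"] += 1  # restriction (1): too little distance
        elif len(pending) >= n_buf:
            counts["fifo_lost"] += 1  # restriction (2): too many pending readouts
        else:
            # Serial readout: starts when the previous one is done (assumption).
            last_done = max(t, last_done) + t_out
            pending.append(last_done)
            last_accepted = t
            counts["accepted"] += 1
    return counts

def deadtime_fraction(counts):
    """Fraction of all triggers that were not accepted."""
    return 1.0 - counts["accepted"] / sum(counts.values())

if __name__ == "__main__":
    # SVD example parameters from the slide: t_in = 210 ns, t_out = 30 us, N_buf = 5.
    for n_buf in (1, 3, 5):
        c = simulate_deadtime(30e3, 210e-9, 30e-6, n_buf)
        print(n_buf, c, f"{deadtime_fraction(c):.3%}")
```

Scanning `n_buf` in this toy model shows why a smaller buffer depth is costly: the FIFO losses grow quickly once fewer readouts can be queued.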

#### Question to trigger people:

*Can you provide triggers with 200 ns (12 clocks at 63 MHz) spacing?* 

- Shorter than CDC drift time, shared hits by two L1 timings
- How about ECL?
- A longer spacing (~500 ns) will add 1% deadtime

#### Question to detector people:

*Can you read out with 200 ns (12 clocks at 63 MHz) spacing for up to $N_{buf} = 5$ events?*

- A longer spacing (~500 ns) will add 1% deadtime
- A smaller  $N_{buf}$  will be fatal for deadtime!
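
As a rough cross-check of the "~500 ns adds 1%" statement, the deadtime from a minimum trigger spacing can be estimated with the standard non-paralyzable formula $R\,t_{in}/(1 + R\,t_{in})$. The sketch below assumes Poisson triggers at 30 kHz; the function names are hypothetical.

```python
def clocks_to_ns(n_clocks, f_clk_hz):
    """Duration of n_clocks cycles of a clock, in ns."""
    return n_clocks / f_clk_hz * 1e9

def min_spacing_deadtime(rate_hz, t_in_s):
    """Non-paralyzable deadtime fraction for a minimum spacing t_in (Poisson triggers assumed)."""
    x = rate_hz * t_in_s
    return x / (1.0 + x)

spacing_200 = min_spacing_deadtime(30e3, 200e-9)
spacing_500 = min_spacing_deadtime(30e3, 500e-9)

if __name__ == "__main__":
    print(f"12 clocks at 63 MHz = {clocks_to_ns(12, 63e6):.1f} ns")
    print(f"200 ns spacing: {spacing_200:.2%}")
    print(f"500 ns spacing: {spacing_500:.2%}")
    print(f"extra deadtime: {spacing_500 - spacing_200:.2%}")
```

With these assumptions the extra deadtime comes out just under one percentage point, in line with the 1% quoted above.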

### Slow pipeline readout

Readout time is expected to be slow in ECL (and PXD)

- Minimum buffer separation of 500 ns  $\times$  16 samples (8  $\mu$ s)
- Two or more triggers have to share the same samples
- For  $t_{in} < 500 \text{ ns}$ , the same samples have to be read out; they could be separated offline ( $\sigma(t) \sim 100 \text{ ns}$  for 5 MeV)



#### Datasize

- SVD (8 kB) twice more hits? (16 kB?)
- CDC (5 kB) twice more channels (10 kB?)
- ECL (10 kB) up to 30% occupancy? (12 kB?)
- TOP —
- ARICH —
- KLM (4kB)
- PXD —

Need to fill this table...
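
A partial sum of the entries that are already filled in gives a feeling for the scale. This is only a sketch: TOP, ARICH and PXD are left out because no numbers are given, keeping KLM at 4 kB is an assumption, and 1 kB is taken as 1000 bytes.

```python
# Event-size contributions in kB: (Belle now, SuperBelle guess), from the list above.
# KLM has no projection in the list; keeping it at 4 kB is an assumption.
SIZES_KB = {
    "SVD": (8, 16),
    "CDC": (5, 10),
    "ECL": (10, 12),
    "KLM": (4, 4),
}

def total_kb(sizes, column):
    """Sum one column (0 = now, 1 = projected) over all listed detectors."""
    return sum(pair[column] for pair in sizes.values())

def throughput_bytes_per_s(event_kb, trigger_rate_hz):
    """Data rate for a given event size and trigger rate (1 kB = 1000 bytes)."""
    return event_kb * 1000 * trigger_rate_hz

now_kb = total_kb(SIZES_KB, 0)
projected_kb = total_kb(SIZES_KB, 1)

if __name__ == "__main__":
    print(f"listed detectors now:       {now_kb} kB/event")
    print(f"listed detectors projected: {projected_kb} kB/event")
    rate = throughput_bytes_per_s(projected_kb, 30e3)
    print(f"at 30 kHz: {rate / 1e9:.2f} GB/s (without TOP, ARICH, PXD)")
```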

### **COPPER** platform

- Modular structure: 4x FINESSE daughter cards for readout, 1x PrPMC CPU, 1x trigger receiver
- ~ 200 COPPER2 boards have been used in Belle in place of FASTBUS TDC with TDC-FINESSE (CDC, ACC, TRG, EFC (, KLM))



- PC or COPPER? PCIe would be too advanced, but no reason to start with PCI, and COPPER has a bigger channel density
- COPPER3 board: revised in 2008 for next 10-year lifetime

Heart of SuperBelle DAQ (also constraint to the design)

### **COPPER** issue

- Limitation due to PCIbus
  - $\rightarrow$ 1 kB $\times$ 30 kHz would be the maximum (need to test again)
  - Data should be transmitted through the on-CPU GbE link (opposite to the current usage)



- Limitation due to CPU
  - Radisys EPC-6315 has a bottleneck, not usable at 30 kHz
  - New CPU to be developed, with Intel Atom
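
For scale, the PCI-bus limit above can be turned into a per-board output rate and compared with the GbE line rate. A minimal sketch, assuming 1 kB = 1000 bytes, a raw GbE rate of 1 Gbit/s without protocol overhead, and the 100 kB/event guess from the DAQ mission slide; the names are hypothetical.

```python
import math

GBE_LINE_RATE_BPS = 1e9  # raw Gigabit Ethernet line rate, protocol overhead ignored

def copper_output_bps(fragment_bytes, trigger_rate_hz):
    """Output bit rate of one COPPER board for a given event-fragment size."""
    return fragment_bytes * 8 * trigger_rate_hz

per_board_bps = copper_output_bps(1000, 30e3)   # 1 kB x 30 kHz
gbe_fraction = per_board_bps / GBE_LINE_RATE_BPS
# Boards needed if a 100 kB event is split into fragments of at most 1 kB each.
boards_needed = math.ceil(100_000 / 1000)

if __name__ == "__main__":
    print(f"per board: {per_board_bps / 8e6:.0f} MB/s ({gbe_fraction:.0%} of GbE)")
    print(f"boards for 100 kB/event: {boards_needed}")
```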

#### Unification of readout

- Belle has been VERY successful on the Q-to-T concept and a unified LeCroy FASTBUS TDC based readout system
- A similar concept is desirable, even if Q-to-T is unusable
  - unified COPPER platform and unified data link



#### Datalink FINESSE (proposal, to be discussed with IHEP and Hawaii)

- FINESSE interface at COPPER
  - 32-bit FIFO data out + FIFO control
  - 7-bit address, 8-bit data, clock, trigger, status, reset, etc.
- Same set of pins at the remote FPGA (remote FINESSE)
  - Data transfer and control with 3 Gbps RocketIO
- Same FINESSE board can be used for all detectors
  - Feature extraction at FINESSE?



#### Datalink R&D

- IHEP group has fabricated a VME6U module with Virtex2pro
- Electrically tested; firmware to be developed soon



### Trigger Timing Distribution (proposal)



- Limited number of fast signals through LVDS (or PECL/CML) on a 4-pair CAT6 cable: RCLK (readout clock), trigger, revolution or other clock, plus 1 reserve line
- Requires one RJ-45 connector on the front-end board (already implemented in the CDC prototype board)
- Plan to make a VME6U module (second version of TT-IO)
- More control through slower RocketIO link (tag, reset, etc)

### Virtex5LXT

#### Is Virtex5LXT a good choice?

- Just RocketIO + FPGA logic, without many unnecessary extras
- 3rd generation RocketIO, with nice features like equalization
- Is Virtex5LXT too expensive?
  - XC5V30 costs about 350 USD per chip, but it costs much more if we need something larger
  - So far I'm not willing to adopt an additional DSP even if it's much cheaper, since it doubles the learning cost.

#### No other choice?

- I don't think it's a good idea to go back to Virtex2pro
- Xilinx has announced Spartan6, should be a cheaper alternative but no details are given yet

## End

#### RocketIO datalink

- RocketIO GTP (3 Gbps), GTX (6.5 Gbps) in Xilinx Virtex5 FPGA
  - 6.5 Gbps: 32 bits at 160 MHz; need to format a larger data packet (2000 bits)
  - 8b10b encoding for safe transfer of an 8-bit payload in 10 bits; also allows formatting codes to be embedded in the data
  - Latency problem: >30 clocks (O(1 $\mu$s)) for en-/decoding
- Asynchronous to the RF clock
  - Has to be driven by a local oscillator
  - CDC/ECL triggers are much slower than the clock cycle
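
The "32 bits at 160 MHz" figure follows from the 8b10b overhead: only 8 of every 10 line bits carry payload. A minimal sketch of that arithmetic, with hypothetical function names:

```python
def payload_rate_bps(line_rate_bps):
    """Payload bit rate after 8b10b encoding (8 payload bits per 10 line bits)."""
    return line_rate_bps * 8 / 10

def word_rate_hz(line_rate_bps, word_bits=32):
    """How many words of word_bits can be sent per second."""
    return payload_rate_bps(line_rate_bps) / word_bits

if __name__ == "__main__":
    for name, line_rate in (("GTP", 3e9), ("GTX", 6.5e9)):
        print(f"{name}: {payload_rate_bps(line_rate) / 1e9:.1f} Gbit/s payload, "
              f"32-bit words at {word_rate_hz(line_rate) / 1e6:.1f} MHz")
```

For GTX at 6.5 Gbps this gives 162.5 MHz, matching the rounded 160 MHz above; for GTP at 3 Gbps the same formula gives 75 MHz.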



### Trigger Timing Distribution (TTD)

M. Nakao

Trigger and clock to COPPER (mixed system clocks — already working)



### Deadtime-free trigger distribution

- Readout status from frontend through COPPER to TTD
  - RocketIO + serialbus latency  $\sim 1.5 \ \mu s$
  - Status can be embedded in the RocketIO datalink using the "K character" of 8b10b encoding
- Pipelined trigger handshake scheme
  - Data integrity (no data-driven FIFO full handling)
  - TTD can issue  $N_{buf}$  (=5) triggers with at least  $t_{in}$  (~ 200 ns) interval before seeing the response
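
The handshake above behaves like credit-based flow control: the TTD may have at most $N_{buf}$ triggers outstanding and must respect $t_{in}$ between them. The class below is only an illustrative software model with hypothetical names, not the TTD firmware; times are integer nanoseconds to avoid floating-point edge effects.

```python
class TriggerThrottle:
    """Toy model of the pipelined trigger handshake on the TTD side."""

    def __init__(self, n_buf=5, t_in_ns=200):
        self.n_buf = n_buf
        self.t_in_ns = t_in_ns
        self.outstanding = 0      # triggers issued, readout not yet reported done
        self.last_issue_ns = None

    def try_issue(self, now_ns):
        """Return True and book a buffer if a trigger may be issued at now_ns."""
        if self.last_issue_ns is not None and now_ns - self.last_issue_ns < self.t_in_ns:
            return False          # closer than t_in to the previous trigger
        if self.outstanding >= self.n_buf:
            return False          # all N_buf buffers may still be occupied
        self.outstanding += 1
        self.last_issue_ns = now_ns
        return True

    def on_readout_done(self):
        """Called when the frontend status reports one finished readout."""
        if self.outstanding > 0:
            self.outstanding -= 1

if __name__ == "__main__":
    ttd = TriggerThrottle()
    for t_ns in (0, 100, 200, 400, 600, 800, 1000):
        print(t_ns, ttd.try_issue(t_ns))
```

Because the TTD only ever counts its own triggers and the returned status, the frontend never has to handle a data-driven FIFO-full condition.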



### Continuous injection deadtime

#### Injection noise

- Short component all over the ring
- Long component only in the injected bunch every 10 μs (two components?)

#### Injection veto for the L1 trigger

- 150 µs veto for the short component, plus 10% × 2.5 ms for the long component
- 50 Hz injection  $\Rightarrow$  <u>~2% deadtime</u>
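
The ~2% figure can be reproduced with simple arithmetic, reading the long-component veto as 10% of the time during a 2.5 ms window after each injection (this reading is an assumption). Function and parameter names are hypothetical.

```python
def injection_deadtime(inj_rate_hz, short_veto_s, long_window_s, long_fraction):
    """Deadtime fraction from the injection veto: full veto for the short
    component plus a partial veto (long_fraction) over the long window."""
    vetoed_per_injection = short_veto_s + long_fraction * long_window_s
    return inj_rate_hz * vetoed_per_injection

frac = injection_deadtime(50, 150e-6, 2.5e-3, 0.10)

if __name__ == "__main__":
    print(f"injection deadtime: {frac:.1%}")
```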

#### Injection effects on PXD?

- Takes 10  $\mu$ s to read out
- Always affected by the long component



### Radiation hardness / magnetic field

#### Need to put FPGAs in the radiation area

- SEU (single event upset) can affect the FPGA configuration memory
- Crucial logic (buffer pointer, event counter, etc) has to be 3-fold redundant (partially damaged data is not a problem)
- Latest FPGA (Virtex5) has ECC of config memory
- Optical transmitter may also get damaged (experiences from BESIII experiment — Z.-A. Liu)
- Series of radiation tests are scheduled
  - '09 March with neutron source, '09 April in KEKB tunnel...
- Magnetic field could also be an issue (CDC readout)
  - RocketIO power supplies require ferrite filters, which will stop working in the magnetic field
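
The 3-fold redundancy for crucial logic is usually realized as a bitwise majority vote over three copies of a register. The Python sketch below only illustrates the voting logic; in practice it would be written in the FPGA's HDL, and the function name is hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise majority of three copies: each output bit follows at least two inputs."""
    return (a & b) | (a & c) | (b & c)

if __name__ == "__main__":
    event_counter = 0x1234
    upset_copy = event_counter ^ 0x0100  # one copy with a single flipped bit (SEU)
    voted = majority_vote(event_counter, upset_copy, event_counter)
    print(hex(voted))
```

A single upset in any one copy is outvoted by the other two, so the voted value stays correct until a second copy is hit.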

### FAQ: why not do this and that?

#### What if the trigger rate exceeds 30 kHz? Isn't it too optimistic?

- It's not as optimistic as reaching  $8 \times 10^{35}$  :-)
- APV25 may be able to operate in a mode with fewer samples.
  (Other readout systems should preferably have a better margin)
- COPPER has to be replaced with something else (e.g., making datalink compatible with Gb Ethernet and receiving with a huge PC cluster)
- We also always have an option to tighten the trigger

#### Why do you use COPPER from the beginning? Why not just a PC?

- COPPER is more compact than a 1U rackmount PC (17 COPPER boards in 9U space)
- FINESSE is easier to develop than a PCI card
- We already have 200+ COPPER boards and software

### Physics and background rates

| Physics                         | Cross-section (nb) | Rate (Hz) at $10^{34}$ | Rate (Hz) at $8 \times 10^{35}$ |
|---------------------------------|--------------------|------------------------|---------------------------------|
| $\Upsilon(4S)$                  | 1.2     | 12           | 960                   |
| $q\overline{q}$                 | 2.8     | 28           | 2200                  |
| $\tau^+\tau^-$                  | 0.8     | 8            | 640                   |
| $\mu^+\mu^-$                    | 0.8     | 8            | 640                   |
| Bhabha (1/100 prescale)         | 44      | 4.4          | 350                   |
| $\gamma\gamma$ (1/100 prescale) | 2.4     | 0.24         | 19                    |
| two-photon ( $p_T > 0.3$ GeV)   | 15      | 35           | 2800                  |
| total                           | 67      | ~100         | ~8000                 |
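
Most rows of the table follow from rate = cross-section × luminosity (divided by the prescale where one is applied). The sketch below assumes the cross-sections are in nb, which is what the table's own numbers imply (1 nb = $10^{-33}$ cm$^2$); the function name is hypothetical. The two-photon row does not follow this simple product (15 nb would give 150 Hz at $10^{34}$), so it is left out.

```python
NB_TO_CM2 = 1e-33  # 1 nb in cm^2

def rate_hz(sigma_nb, lumi_cm2_s, prescale=1):
    """Event rate for a cross-section in nb and a luminosity in cm^-2 s^-1."""
    return sigma_nb * NB_TO_CM2 * lumi_cm2_s / prescale

if __name__ == "__main__":
    processes = [
        ("Upsilon(4S)", 1.2, 1),
        ("q qbar", 2.8, 1),
        ("tau+ tau-", 0.8, 1),
        ("Bhabha", 44, 100),
        ("gamma gamma", 2.4, 100),
    ]
    for name, sigma_nb, prescale in processes:
        print(f"{name:12s} {rate_hz(sigma_nb, 1e34, prescale):7.2f} Hz"
              f" {rate_hz(sigma_nb, 8e35, prescale):8.1f} Hz")
```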

**Backgrounds** (at Belle,  $\mathcal{L} \sim 1-1.5 \times 10^{34}$  and  $I_{\text{HER}} + I_{\text{LER}} \sim 3$ A, rate~400 Hz)

- Luminosity term (~ 300 Hz) dominant, 2–2.5 times physics rate (Radiative Bhabha hitting endcap? BaBar's problem, not Belle's)
- Constant term is about 100 Hz in total
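
A naive extrapolation of these two terms shows where the 20 kHz nominal rate comes from. This is a sketch under strong assumptions: the luminosity term scales linearly with luminosity and the constant term is left unchanged (in reality it would depend on the beam currents). The function name is hypothetical.

```python
def extrapolate_rate_hz(lumi_term_hz, const_term_hz, lumi_ref, lumi_target):
    """Scale the luminosity term linearly and keep the constant term fixed (assumptions)."""
    return lumi_term_hz * lumi_target / lumi_ref + const_term_hz

if __name__ == "__main__":
    # Belle reference luminosity is quoted as a range, so scan both ends.
    for lumi_ref in (1.0e34, 1.5e34):
        rate = extrapolate_rate_hz(300, 100, lumi_ref, 8e35)
        print(f"reference {lumi_ref:.1e}: {rate / 1e3:.1f} kHz at 8e35")
```

The two ends of the reference range bracket the 20 kHz nominal trigger rate quoted for SuperBelle.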

#### Frontend electronics

|             | sensor + analog       | digitization                |
|-------------|-----------------------|-----------------------------|
| PXD         | DEPFET pixel          | DCD+DHP ASICs(DEPFET group) |
| SVD         | strip (+ APV25)       | flash ADC + FPGA            |
| CDC         | sense wires           | flash ADC + FPGA TDC(?)     |
|             |                       | TARGET ASIC(U-Hawaii) (?)   |
| TOP         | MCP-PMT(?) + CFD      | HP TDC (?)                  |
|             |                       | BLAB2 ASIC(U-Hawaii) (?)    |
| ARICH       | HAPD(?)               | SA ASIC(JAXA) + FPGA        |
| ECL(barrel) | CsI(Tl) + photo-di.   | Flash ADC(2MHz) + FPGA      |
| ECL(endcap) | CsI(pure) + photo-di. | Flash ADC(42MHz) + FPGA     |
| KLM         | Sci. + Si-PM          | TARGET ASIC(U-Hawaii) (???) |

- Mostly driven by existing technologies (constraints on the DAQ): ASIC developments, commercial flash ADCs, heavy use of FPGAs
- Unification to some extent is under discussion